Abstract



We propose a new coding technique based on nested coset codes and derive a new achievable 
rate region for a general three user discrete memoryless broadcast channel (DMBC). We identify an 
example of a three user binary broadcast channel for which the proposed achievable rate region strictly 
outperforms that obtained by a natural extension of Marton's [1] rate region. As a step towards
deriving the achievable rate region for the general three user DMBC, we introduce the new elements
of our coding theorem through a new class of broadcast channels called 3-to-1 broadcast channels.

1 Introduction

The problem of characterizing the capacity region of a broadcast channel was proposed by Cover [2] in
1972, and he introduced a novel coding technique to derive achievable rate regions for particular degraded
broadcast channels. In a seminal work aimed at deriving an achievable rate region for the general degraded
broadcast channel, Bergmans [3] generalized Cover's technique into what is currently referred to
as superposition coding. Gallager [4] and Bergmans [5] concurrently and independently proved optimality
of superposition coding for the class of degraded broadcast channels. This in particular yielded the capacity
region of the scalar additive Gaussian broadcast channel. However, the case of the general discrete memoryless
broadcast channel (DMBC) remained open. This problem led to the discovery of another ingenious
coding technique^1. In 1979, following the works of [SHU, Marton [1] proposed the technique of binning. In
conjunction with superposition, she derived the largest known rate region for the general two user DMBC.
A generalization J5J p. 391 Problem 10(c)] of superposition and binning to incorporate a common message
is the largest known rate region for the general DMBC, and its capacity is yet unknown^2.



"This work was supported by NSF grants CCF-0915619 and CCF-1116021. 
1 We remark that the general DMBC is richer in terms of the strategies it permits.

2 It is of interest to note that though superposition and binning were known in particular settings [2], [9], their generalization
led to fundamentally new ideas. For example, the description of a rate region using an auxiliary random variable [3] and the
technique of binning have proved to be invaluable in deriving information theoretic achievable rate regions.
Though the capacity region has been found for many interesting classes of broadcast channels Q][2J
[10]-[20], the question of whether the rate region derived by Marton is optimal for the general DMBC
has remained open for over thirty years. Following a period of reduced activity, there has been renewed
interest [21], [22] in settling this question. Gohari and Anantharam [23] have proved computability of
Marton's rate region. This enabled them to identify a class of binary broadcast channels for which Marton's [1]
rate region, when computed, is strictly smaller than the tightest known outer bound [24], [25], which is due
to Nair and El Gamal. On the other hand, Weingarten, Steinberg and Shamai [21] have proved Marton's
binning (also referred to, in the Gaussian setting, as Costa's dirty paper coding [57]) to be optimal for the
Gaussian MIMO broadcast channel, and thereby characterized the capacity region for the particular class of
Gaussian vector broadcast channels. It is of interest to note the optimality of Marton's binning technique
for Gaussian vector broadcast channels with an arbitrary number of receivers. In this article, we (1) propose a
new coding technique based on structured codes that enables us to (2) derive a new achievable rate region
for the general three user discrete broadcast channel, and thereby (3) provide a strict enlargement of the
largest currently known achievable rate region^3. In addition to providing a new coding technique and a new
achievable rate region, we provide an example of a binary additive three user broadcast channel for which
the proposed rate region contains a triple of rates not achievable using Marton's technique. Indeed, one of
the key elements of our work is an analytical proof of suboptimality of Marton's rate region for the three
user broadcast channel.

The framework proposed herein is based on our earlier work on distributed source coding and 3-user
interference channels in the discrete memoryless setting [23123. The coding technique proposed herein is
reminiscent of that proposed for the general three user interference channel.

While at first glance it appears that the gains we project are similar to those harnessed in the interference
channel, we opine that the phenomenon exploited here is fundamentally different. In a two user
broadcast channel, signals intended for a user interfere with signals intended for the other. The two coding
techniques, superposition and binning, exemplify the two ways interference can be tackled. Firstly,
superposition enables each user to decode one component of the other user's signal and thus subtract it off.
Secondly, binning enables the encoder to counter each user's interfering signal not decoded by the other by
precoding for the same. Except for particular cases, the most popular being dirty paper coding, precoding
results in a rate loss, i.e., precoding at the encoder is less efficient than decoding the
interfering signal at the decoder. The presence of a rate loss motivates each decoder to decode as large a
part as possible of the interference pattern^4. However, decoding a large part of the interference constrains
the individual rates. In a three user broadcast channel, each user's reception is plagued by interference
caused by signals intended for the other two users. The interference is in general a bi-variate function of
signals intended for the other users. If the signals of the two users are endowed with structure that can help
compress the range of this bi-variate function when applied to all possible signals, then the receivers can
decode a large part of the interfering signal. This minimizes the component of the interference precoded, and
therefore the rate loss^5. This is where codebooks endowed with algebraic structure outperform unstructured

3 The largest known achievable rate region for the general three user discrete broadcast channel is the natural extension of 
Marton's rate region for the two user case. We henceforth refer to this as Marton's rate region for three user DMBC 

4 For the Gaussian case, there is no rate loss. Thus the encoder can precode all the interference. Indeed, the optimal 
strategy does not require any user to decode a part of signal not intended for it. 

5 For the Gaussian case, precoding suffers no rate loss and hence no part of the interference needs to be decoded. Thus 






independent codebooks. Indeed, linear codes constrain the interference pattern to an affine subspace if the
interference is the sum of user 2 and 3's signals. It is our belief that additional degrees of freedom prevalent
in a three user information theoretic problem can be harnessed with codebooks endowed with algebraic
structure. Whether structure in codebooks can be exploited for a two user problem remains open.

The astute reader will question the case when the bi-variate function is not a sum. Towards answering 
this question, we consider a natural generalization of linear codes to sets with looser algebraic structure 
such as groups. Our investigation of group codes, kernels of group homomorphisms, to improve achievable 
rate regions for information theoretic problems has been pursued in a concurrent research thread.

The role of structured codes for improving information theoretic rate regions began with the ingenious
technique of Korner and Marton [32] proposed for the source coding problem of computing the modulo two
sum of distributed binary sources. Han and Kobayashi [33] categorized a class of function reconstruction
problems for which Korner and Marton's technique provided strict gains over the largest known rate regions
using unstructured codes. Ahlswede and Han [34] proposed a universal coding technique that brings
together coding techniques based on unstructured and structured codes^7. More recently, there is a wider
interest [35]-[37] in developing coding techniques for particular problem instances that do better than the
best known techniques based on unstructured codes. It was shown in [33], in the setting of distributed
source coding, that for every non-trivial and truly bi-variate function, there exists at least one source
distribution for which linear codes outperform random codes. Even then, it was largely believed that
codebooks possessing algebraic structure can be exploited only for modulo additive channel and source
coding problems. Indeed, linear codes were known to be suboptimal for communicating over arbitrary
point to point channels (and similarly for lossy compression of sources subject to an arbitrary distortion),
and therefore, the basic building block in the coding scheme for any multi-terminal communication problem
could not be filled by linear codes. For over thirty years, since the work of Korner and Marton came to
light, neither did we know of a coding technique based on unstructured codes that did as well, nor did we
know of a framework for coding based on structured codes for which the above findings were a particular
case.

Krithivasan and Pradhan [55] have proposed the ensemble of nested coset codes as the basic building 
block of algebraic codes for compressing sources subject to any arbitrary distortion. They employ this 
ensemble to propose a framework for communicating information from distributed encoders observing 
correlated sources to a centralized decoder. Firstly, this framework generalizes the technique proposed 
by Korner and Marton for the general problem of distributed function computation, joint quantization of 
distributed sources, etc. Secondly, in conjunction with the Berger-Tung [38] technique, this strictly enlarges

constraining interference patterns is superfluous. This explains why lattices are not necessary to achieve capacity of Gaussian 
vector broadcast channel. 

6 We also bring to the attention of the interested reader our investigation [31] of pseudo group codes. While linear codes are
completely compressive under the operation of addition, and unstructured independent codes are completely explosive, pseudo
group codes lie in between. In other words, when two pseudo group codes of rate R are operated under the group operation,
the rate of the resulting codebook lies between R and 2R. Pseudo group codes are of interest since they outperform group
codes for point to point communication.

7 Indeed, the coding techniques based on structured codes do not substitute for coding techniques based on unstructured 
codes. For example, reconstructing a pair of sources losslessly using two source codes that are partitioned using a common 
channel code can be strictly sub optimal. Similarly, the technique of partitioning independent source codes using independent 
channel codes is sub optimal for the problem of losslessly reconstructing modulo two sum of binary sources. 






the largest known achievable rate region for the problem of distributed function computation. In the same
spirit, in [29] we proposed the ensemble of nested coset codes as an ensemble of codes possessing algebraic
structure that achieves the capacity of arbitrary point to point channels. The technique proposed by Philosof
and Zamir has been elevated to derive a new achievable rate region [39] for an arbitrary discrete multiple
access channel with distributed states. We employed this ensemble to derive a new achievable rate region
for the general discrete three user interference channel.

The contributions of this article are threefold. Firstly, we propose a framework based on structured codes that
enables us to derive a new achievable rate region, described through information theoretic quantities, for a
general three user broadcast channel. Secondly, we
propose the technique of joint typical encoding and decoding with codebooks possessing algebraic structure.
Thirdly, our analysis of error events using correlated codebooks contains new elements.

This article is organized as follows. We begin with preliminaries in section 2. In section 3 we introduce
a binary additive three user broadcast channel for which Marton's rate region is strictly suboptimal. In
section 4 we define the class of 3-to-1 broadcast channels and generalize the coding technique proposed
for the binary example to a general 3-to-1 broadcast channel. Finally, in section 5 we propose a coding
technique for the general three user DMBC based on nested coset codes.



2 Preliminaries 

A three-user discrete memoryless broadcast channel (DMBC) used without feedback is a sextuple
$(\mathcal{X}, \mathcal{Y}_1, \mathcal{Y}_2, \mathcal{Y}_3, W_{Y_1,Y_2,Y_3|X}, c)$, where $\mathcal{X}$ is the input alphabet, $\mathcal{Y}_i$ for $i = 1, 2, 3$ are the three output alphabets,
$W_{Y_1,Y_2,Y_3|X}(\cdot, \cdot, \cdot | x)$ is a collection of distributions on $\mathcal{Y}_1 \times \mathcal{Y}_2 \times \mathcal{Y}_3$, one for every $x \in \mathcal{X}$, and $c : \mathcal{X} \to \mathbb{R}^+$ is a cost function.
The channel is assumed to be memoryless.

Definition 1. An $(n, \Theta_1, \Theta_2, \Theta_3)$ transmission system for a given DMBC consists of an encoder mapping

$e : [\Theta_1] \times [\Theta_2] \times [\Theta_3] \to \mathcal{X}^n,$
where $[\Theta_j] = \{1, \cdots, \Theta_j\}$, and three decoder mappings

$g_i : \mathcal{Y}_i^n \to [\Theta_i]$

for $i = 1, 2$, and $3$.

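We assume that the messages $(M_1, M_2, M_3)$ are drawn uniformly from the set $\{1, \ldots, \Theta_1\} \times \{1, \ldots, \Theta_2\} \times
\{1, \ldots, \Theta_3\}$. The cost associated with a vector $x^n$ of length $n$ is additive and is given by

$\bar{c}(x^n) = \frac{1}{n} \sum_{t=1}^{n} c(x_t).$

The average error probability of the above transmission system is given by

$\bar{\xi} = \frac{1}{\Theta_1 \Theta_2 \Theta_3} \sum_{m_1=1}^{\Theta_1} \sum_{m_2=1}^{\Theta_2} \sum_{m_3=1}^{\Theta_3} P\big( (g_1(Y_1^n), g_2(Y_2^n), g_3(Y_3^n)) \ne (m_1, m_2, m_3) \,\big|\, (M_1, M_2, M_3) = (m_1, m_2, m_3) \big),$

and the average cost incurred is given by

$\tau = \frac{1}{\Theta_1 \Theta_2 \Theta_3} \sum_{m_1=1}^{\Theta_1} \sum_{m_2=1}^{\Theta_2} \sum_{m_3=1}^{\Theta_3} \bar{c}(e(m_1, m_2, m_3)).$
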
Definition 2. A quadruple of rates and cost $(R_1, R_2, R_3, C)$ is said to be achievable for a given DMBC if
for every $\epsilon > 0$ there exists an $N(\epsilon)$ such that for all $n > N(\epsilon)$ there exists an $(n, \Theta_1, \Theta_2, \Theta_3)$ transmission system
whose average error probability $\bar{\xi}$ and average cost $\tau$ satisfy the following conditions

$\frac{1}{n} \log \Theta_i \ge R_i - \epsilon$ for $i = 1, 2, 3$, $\quad \bar{\xi} \le \epsilon$, $\quad \tau \le C + \epsilon$.

The set of all achievable rate triples at cost $C$ is the capacity region of the DMBC at cost level $C$.



3 Binary Example 

We present an example of a three user DMBC for which the natural extension of the largest known rate
region, due to Marton, is strictly sub-optimal. In particular, we identify a triple of rates that is achievable
using a coding technique based on linear codes and is not contained in Marton's rate region.



3.1 Description of the three user broadcast channel 

Let $\mathbb{F}_2 = \{0, 1\}$ denote the binary field and $\oplus$ denote addition in $\mathbb{F}_2$. The input alphabet is $\mathcal{X} = \mathbb{F}_2 \times \mathbb{F}_2 \times \mathbb{F}_2$
and the three output alphabets are $\mathcal{Y}_j = \mathbb{F}_2 : j = 1, 2, 3$. The channel is depicted in Fig. 1. Let
$X = (X_1, X_2, X_3)$ denote the input to the channel, where $X_j \in \mathbb{F}_2$, and let $Y_j \in \mathbb{F}_2$ denote the output at receiver
$j$. That is, the channel has an octonary input and binary outputs. The channel transitions are described
through the relations $Y_2 = X_2 \oplus N_2$, $Y_3 = X_3 \oplus N_3$ and $Y_1 = X_1 \oplus X_2 \oplus X_3 \oplus N_1$, where

• $N_1, N_2, N_3$ are mutually independent,

• $(N_1, N_2, N_3)$ is independent of $(X_1, X_2, X_3)$,

• for $j = 2, 3$, $P(N_j = 1) = \epsilon$ and $P(N_1 = 1) = \delta$,

• $\epsilon, \delta \in (0, \frac{1}{2})$.

The input is subject to an average cost constraint $\frac{1}{n} E\{w_H(X_1^n)\} \le q$ on the component $X_1$, where $w_H$ is the Hamming weight
function, and $q \in (0, \frac{1}{2})$. We restrict to the case $q * \delta < \epsilon$, where $q * \delta = (1 - q)\delta + (1 - \delta)q$.
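
The channel relations above can be sketched as a small simulator. This is only an illustration under stated assumptions: the function name `broadcast_channel` and the `rng` argument are hypothetical and not part of the paper, and the cost constraint on the input is not enforced here.

```python
import random

def broadcast_channel(x1, x2, x3, eps, delta, rng=random):
    """One use of the binary three user channel described in the text.

    x1, x2, x3 are input bits; eps is P(N2 = 1) = P(N3 = 1) and
    delta is P(N1 = 1). Returns the output bits (y1, y2, y3).
    """
    n1 = 1 if rng.random() < delta else 0  # noise at receiver 1
    n2 = 1 if rng.random() < eps else 0    # noise at receiver 2
    n3 = 1 if rng.random() < eps else 0    # noise at receiver 3
    y1 = x1 ^ x2 ^ x3 ^ n1  # receiver 1 sees the sum of all three inputs
    y2 = x2 ^ n2            # receiver 2 sees only X2
    y3 = x3 ^ n3            # receiver 3 sees only X3
    return y1, y2, y3
```

Setting both noise parameters to 0 (outside the range the paper considers, but convenient for checking) makes the outputs deterministic functions of the inputs.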




Figure 1: A 3-user broadcast channel with octonary input and binary outputs 
3.2 An achievable rate region using linear codes

We present a coding technique based on linear codes that achieves the following rate region

$\{(R_1, R_2, R_3) : R_1 \le h_b(q * \delta) - h_b(\delta),\ R_2 \le 1 - h_b(\epsilon),\ R_3 \le 1 - h_b(\epsilon)\},$

where $h_b(\cdot)$ is the binary entropy function. Let users 2 and 3 employ the same linear code that achieves
the capacity of a binary symmetric channel with crossover probability $\epsilon$. User 1 employs a nested linear
code [29] that achieves capacity on a binary symmetric channel with crossover probability $\delta$ and average
input Hamming weight constraint $q$. Let $X_j^n : j = 1, 2, 3$ represent user $j$'s codeword. The input to the
channel is $X^n = (X_1^n, X_2^n, X_3^n)$. Clearly, users 2 and 3 achieve their respective capacities. User 1 first decodes
$X_2^n \oplus X_3^n$, the sum of user 2 and 3's transmissions. Because users 2 and 3 share a linear code, this sum is
itself a codeword of that code, and with $X_1^n$ treated as noise user 1 effectively observes it through a binary
symmetric channel with crossover probability $q * \delta$. Since $q * \delta < \epsilon$, this is possible. Having decoded
$X_2^n \oplus X_3^n$, user 1 decodes the intended signal. It is clear that user 1 can achieve a rate $h_b(q * \delta) - h_b(\delta)$.
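
As a numerical companion to the rate region above, the following sketch evaluates the corner point $(h_b(q * \delta) - h_b(\delta), 1 - h_b(\epsilon), 1 - h_b(\epsilon))$ for given parameters. The function names `h_b`, `conv` and `linear_code_rate_triple` are illustrative choices, not notation from the paper.

```python
from math import log2

def h_b(p):
    """Binary entropy in bits, with h_b(0) = h_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + (1 - a)b."""
    return a * (1 - b) + (1 - a) * b

def linear_code_rate_triple(q, delta, eps):
    """Corner point of the linear-code region, assuming q * delta < eps."""
    if not conv(q, delta) < eps:
        raise ValueError("the scheme in the text requires q * delta < eps")
    r1 = h_b(conv(q, delta)) - h_b(delta)  # user 1, after removing X2 + X3
    r23 = 1 - h_b(eps)                     # users 2 and 3, BSC(eps) capacity
    return r1, r23, r23
```

For instance, `linear_code_rate_triple(0.1, 0.05, 0.2)` uses $q * \delta = 0.14 < 0.2$, so the precondition of the scheme holds.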



3.3 Sub-optimality of Marton's rate region 

In this section we prove that $(h_b(q * \delta) - h_b(\delta), 1 - h_b(\epsilon), 1 - h_b(\epsilon))$ is not contained in the rate region obtained by
the natural extension of that proposed by Marton for the two user DMBC when $h_b(\delta * q) < h_b(\epsilon) < \frac{1 + h_b(\delta * q)}{2}$.
In particular, we prove that if $(R_1, 1 - h_b(\epsilon), 1 - h_b(\epsilon))$ is achievable using Marton's technique, then either
$R_1 < h_b(q * \delta) - h_b(\delta)$, or $R_1 = h_b(q * \delta) - h_b(\delta)$ and $h_b(\epsilon) \ge \frac{1 + h_b(\delta * q)}{2}$. Towards that end, we begin with a
characterization of the rate region proposed by Marton for the two user DMBC.

Consider a two user DMBC with input alphabet $\mathcal{X}$, output alphabets $\mathcal{Y}_1, \mathcal{Y}_2$, channel transition probability
$W_{Y_1Y_2|X}(\cdot, \cdot | \cdot)$ and cost function $c : \mathcal{X} \to \mathbb{R}^+$. Let $\mathbb{D}_R(W_{Y_1Y_2|X}, c)$ be the set of distributions^8
$P_{WV_1V_2XY_1Y_2}$ defined over $\mathcal{W} \times \mathcal{V}_1 \times \mathcal{V}_2 \times \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2$, where $\mathcal{W}$, $\mathcal{V}_1$ and $\mathcal{V}_2$ are finite sets with
$\max\{|\mathcal{W}|, |\mathcal{V}_1|, |\mathcal{V}_2|\} \le |\mathcal{X}| + 4$ such that

• $P_{Y_1Y_2|X} = W_{Y_1Y_2|X}$,

• $WV_1V_2 - X - Y_1Y_2$ forms a Markov chain.

For every distribution $P_{WV_1V_2XY_1Y_2} \in \mathbb{D}_R(W_{Y_1Y_2|X}, c)$, let $\alpha_R(P_{WV_1V_2XY_1Y_2})$ be defined as the set of rate
pairs and cost $(R_1, R_2, C)$ such that there exist 6 non-negative real numbers $K_1, K_2, S_1, T_1, S_2$ and $T_2$ that
satisfy the following constraints for $i = 1, 2$: $R_i = K_i + T_i$, and

$0 \le S_i, \quad 0 \le K_i, \quad 0 \le T_i,$
$I(V_1; V_2 | W) \le S_1 + S_2,$
$I(V_i; Y_i | W) \ge T_i + S_i,$
$I(WV_i; Y_i) \ge K_1 + K_2 + T_i + S_i,$

and $E[c(X)] \le C$. Let $\alpha_R(W_{Y_1Y_2|X}, c)$ be the closure of the union of $\alpha_R(P_{WV_1V_2XY_1Y_2})$ over all distributions
$P_{WV_1V_2XY_1Y_2} \in \mathbb{D}_R(W_{Y_1Y_2|X}, c)$.

8 Letter R in the subscript stands for random codes 
By doing Fourier-Motzkin elimination, one can easily show that $\alpha_R(P_{WV_1V_2XY_1Y_2})$ is equal to the set
of all rate pairs and costs $(R_1, R_2, C)$ such that

$0 \le R_1 \le I(WV_1; Y_1)$ (1)

$0 \le R_2 \le I(WV_2; Y_2)$ (2)

$R_1 + R_2 \le I(V_1; Y_1 | W) + I(WV_2; Y_2) - I(V_1; V_2 | W)$

$R_1 + R_2 \le I(V_2; Y_2 | W) + I(WV_1; Y_1) - I(V_1; V_2 | W),$ (3)

and $E[c(X)] \le C$. $\alpha_R(W_{Y_1,Y_2|X}, c)$ is Marton's achievable rate region, and is an inner bound to the
capacity region.
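
To make inequalities (1)-(3) concrete, the following sketch tests whether a rate pair satisfies them for one fixed distribution, given the five mutual information values as plain numbers. The function name and argument names are hypothetical, the cost constraint is not checked, and computing the information quantities from a distribution is outside the scope of this sketch.

```python
def in_marton_region(r1, r2, i_wv1_y1, i_wv2_y2, i_v1_y1_w, i_v2_y2_w, i_v1_v2_w):
    """Check inequalities (1)-(3) for one fixed distribution.

    Arguments (all in bits):
      i_wv1_y1  = I(W V1; Y1)      i_wv2_y2  = I(W V2; Y2)
      i_v1_y1_w = I(V1; Y1 | W)    i_v2_y2_w = I(V2; Y2 | W)
      i_v1_v2_w = I(V1; V2 | W)
    """
    if not (0 <= r1 <= i_wv1_y1):   # inequality (1)
        return False
    if not (0 <= r2 <= i_wv2_y2):   # inequality (2)
        return False
    # The two sum-rate bounds in (3); the tighter one decides.
    sum_bound = min(i_v1_y1_w + i_wv2_y2, i_v2_y2_w + i_wv1_y1) - i_v1_v2_w
    return r1 + r2 <= sum_bound
```

The two sum-rate bounds differ only in which receiver is charged for the common part $W$, which is why the minimum of the two is taken.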

For clarity, let us give a brief interpretation of the rate region. Let $n$ denote the blocklength. A random
code of rate $K_1 + K_2$ is constructed from the distribution $P_W$. Let $W(i)$ denote the $i$th codeword. For
every codeword $W(i)$, a code $C_1(i)$ of rate $T_1 + S_1$ is constructed with distribution $P_{V_1|W}$, with $W(i)$ used
in the conditioning. A similar collection of codes is constructed with distribution $P_{V_2|W}$. Each $V_i$-code is
"partitioned" into bins of rate $S_i$ for $i = 1, 2$. Joint typical encoding and decoding is used on these codes.
The standard error analysis gives the rate pairs mentioned above.

We now discuss the natural extension of Marton's rate region to a three user DMBC with input alphabet
$\mathcal{X}$ and three output alphabets $\mathcal{Y}_j : j = 1, 2, 3$ and transition probabilities $W_{Y|X}$ ($Y = (Y_1, Y_2, Y_3)$). Let
$\alpha_R(W_{Y_1,Y_2,Y_3|X}, c)$ denote the set of all triples of rates and costs that belong to the natural extension of
Marton's rate region. The most compact description of $\alpha_R(W_{Y_1,Y_2,Y_3|X}, c)$ that we are aware of is still
very long, and is given in the Appendix.

Theorem 1. Consider the 3-receiver DMBC given in the binary example for $q, \epsilon, \delta \in (0, \frac{1}{2})$, and $q * \delta < \epsilon$.
If $(R_1, 1 - h_b(\epsilon), 1 - h_b(\epsilon)) \in \alpha_R$, and $h_b(\epsilon) < \frac{1 + h_b(\delta * q)}{2}$, then

$R_1 < h_b(\delta * q) - h_b(\delta).$

Proof. See Appendix. □

It follows from the proof of this theorem that for $\epsilon$ such that $h_b(\epsilon) \ge \frac{1 + h_b(q * \delta)}{2}$, the rate point
$(h_b(q * \delta) - h_b(\delta), 1 - h_b(\epsilon), 1 - h_b(\epsilon))$ belongs to the natural extension of Marton's rate region.
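
The hypotheses of Theorem 1 are easy to check numerically. The sketch below is a hedged illustration: it assumes the threshold reads $\frac{1 + h_b(\delta * q)}{2}$, and the function name `theorem1_hypotheses_hold` is hypothetical. It returns True exactly when $q, \epsilon, \delta \in (0, \frac{1}{2})$, $q * \delta < \epsilon$ and $h_b(\epsilon)$ is below that threshold.

```python
from math import log2

def h_b(p):
    """Binary entropy in bits, with h_b(0) = h_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + (1 - a)b."""
    return a * (1 - b) + (1 - a) * b

def theorem1_hypotheses_hold(q, delta, eps):
    """True when (q, delta, eps) satisfy the conditions stated in Theorem 1."""
    in_range = all(0 < p < 0.5 for p in (q, delta, eps))
    s = conv(q, delta)
    # Assumed threshold: h_b(eps) < (1 + h_b(q * delta)) / 2.
    return in_range and s < eps and h_b(eps) < (1 + h_b(s)) / 2
```

With $q = 0.1$ and $\delta = 0.05$, a moderately noisy $\epsilon = 0.2$ satisfies the hypotheses, while $\epsilon = 0.45$ does not, since the channels to users 2 and 3 are then so noisy that their rates are small.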

4 3-to-1 broadcast channel

4.1 Functional Perspective on Marton's Coding 

To enable us to state our new coding results succinctly with as few auxiliary random variables as possible,
we revisit Marton's rate region for the two-receiver DMBC and see how the coding can be done using
coset codes. First, since coset codes induce a uniform single-letter distribution, we cannot perform conditional
coding. In essence all codebooks are created from the uniform distribution. Second, we can look at the
broadcast channel from the perspective of the interference channel. As noted by [40], in the two-user broadcast
channel, the signals meant for the two users interfere with each other. It behooves each receiver to decode
a part of the interference, i.e., a part of the signal meant for the other receiver, before decoding its own





signal. To get even better performance, they can be decoded jointly. This part of the interference can take 
the form of the output of a univariate function of the signal meant for the other receiver. 

When we go to the three-receiver broadcast channel, we make the case for the part of the interference 
that is decoded at a receiver to take the form of the output of a bivariate function of the signals meant for the 
other two receivers. Consequently, group-theoretic approaches play an important role in the constructions 
of codes that take into account the structure of these bivariate functions. This is in contrast to the natural 
extension of Marton's coding approach where a pair of univariate functions of the signals meant for other 
two users are reconstructed at each receiver. A general approach that combines the two coding schemes 
can be obtained along the lines of the seminal work of Ahlswede and Han [34].

To make these concepts more concrete, consider a two-receiver DMBC $(\mathcal{X}, \mathcal{Y}_1, \mathcal{Y}_2, W_{Y_1,Y_2|X}, c)$. Let
$\mathbb{D}_L(W_{Y_1,Y_2|X}, c)$ denote the set^9 of triples $(P_{U_1,U_2,X,Y_1,Y_2}, g_1, g_2)$ of (a) a probability distribution on the set
$\mathcal{U}_1 \times \mathcal{U}_2 \times \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2$, such that $(U_1, U_2) - X - (Y_1, Y_2)$ and $P_{Y_1,Y_2|X} = W_{Y_1,Y_2|X}$, where $\mathcal{U}_1$ and $\mathcal{U}_2$ are
finite sets, and (b) two univariate functions $g_i : \mathcal{U}_i \to \mathcal{U}_i$. Let $U_{ij} = g_i(U_i)$ for $i, j = 1, 2$, and $i \ne j$. Let
$\max_i |g_i(\mathcal{U}_i)| \le p^r$, a prime power.

The first decoder jointly decodes $(U_1, U_{21})$ and the second decoder jointly decodes $(U_2, U_{12})$. To enable
the decoders to decode parts of the interference, a two-level information coding procedure is employed. A
code is constructed from each of the four variables $U_1$, $U_2$, $U_{12}$ and $U_{21}$. This can be informally interpreted
as imposing the constraint that the code of $U_1$ is "closed" under the univariate function $g_1(\cdot)$, and similarly
for the code on $U_2$.

Let us fix some notation before we proceed further. Let $\mathcal{A}$ denote the set of all subsets of $\{1, 2, 12, 21\}$
such that (a) if 2 is present in the subset then 21 must also be present and similarly (b) if 1 is present
in the subset, then 12 must also be present. When we have four real numbers $S_1$, $S_2$, $S_{12}$ and $S_{21}$ (one for
each element of $\{1, 2, 12, 21\}$), for all $\theta \in \mathcal{A}$, let $S_\theta = \sum_{a \in \theta} S_a$, and let $U_\theta$ denote the collection $\{U_a : a \in \theta\}$.
Similarly, let $\mathcal{A}_1$ denote the set of all subsets of $\{1, 12, 21\}$ that contain the element 1. For $\theta \in \mathcal{A}_1$, let
$\theta^c$ denote the complement of $\theta$ within $\{1, 12, 21\}$, i.e., $\theta^c \cap \{1, 12, 21\}$. Similarly, $\mathcal{A}_2$ is defined.

For every such triple, let $\alpha_L(P_{U_1,U_2,X,Y_1,Y_2}, g_1, g_2)$ be the set of all rate pairs and costs $(R_1, R_2, C)$ such
that there exist 8 non-negative real numbers $S_{ij}, S_i, T_{ij}$ and $T_i$ for $i, j = 1, 2$ and $i \ne j$, that satisfy the
following for all $i, j = 1, 2$, and $i \ne j$: (a) $R_i = T_i - S_i + T_{ij} - S_{ij}$, (b) $T_i \ge S_i$, $T_{ij} \ge S_{ij}$ and $S_{ij} \le \log p^r$,
(c) covering condition for all $\theta \in \mathcal{A}$,

$|\theta| \log p^r - H(U_\theta) \le S_\theta$

and (d) packing condition for all $\theta \in \mathcal{A}_i$ for $i = 1, 2$,

$|\theta| \log p^r - H(U_\theta | U_{\theta^c}, Y_i) \ge T_\theta$

and $E[c(X)] \le C$. Let $\alpha_L(W_{Y_1,Y_2|X}, c)$ denote the closure of the union of $\alpha_L(P_{U_1,U_2,X,Y_1,Y_2}, g_1, g_2)$ over all
$\mathbb{D}_L(W_{Y_1,Y_2|X}, c)$.

Proposition 1. $\alpha_R(W_{Y_1,Y_2|X}, c) = \alpha_L(W_{Y_1,Y_2|X}, c)$

9 The letter L in the subscript stands for linear codes
4.2 Coding Theorem for 3-receiver Broadcast Channel

In order to explain the proposed scheme and its novelty, we describe the same for a particular class
of 3-to-1 broadcast channels. A DMBC $(\mathcal{X}, \mathcal{Y}_1, \mathcal{Y}_2, \mathcal{Y}_3, W_{Y_1,Y_2,Y_3|X})$ is a 3-to-1 broadcast channel if the
input alphabet can be expressed as a Cartesian product of three alphabets $\mathcal{X} = \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3$ such that
$W_{Y_2|X}(y_2|(x_1, x_2, x_3)) = W_{Y_2|X_2}(y_2|x_2)$ and $W_{Y_3|X}(y_3|(x_1, x_2, x_3)) = W_{Y_3|X_3}(y_3|x_3)$. Note that the transition
probabilities $W_{Y_1,Y_2,Y_3|X}$ of a 3-to-1 broadcast channel can be denoted as $W_{Y_1,Y_2,Y_3|X_1X_2X_3}$.

Following is the first main result of this paper. For a 3-to-1 broadcast channel whose transition probability
is $W_{Y_1,Y_2,Y_3|X_1,X_2,X_3}$ and cost function is $c(\cdot)$, let $\mathbb{D}_L(W, c)$ denote the set of triples $(P, g_2, g_3)$ of (a) a
probability distribution $P_{W,U_1,U_2,U_3,X_1,X_2,X_3,Y_1,Y_2,Y_3}$ defined over $\mathcal{W} \times \mathcal{U}_1 \times \mathcal{U}_2 \times \mathcal{U}_3 \times \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3 \times \mathcal{Y}_1 \times \mathcal{Y}_2 \times \mathcal{Y}_3$ having the following properties

• $\mathcal{U}_i$ is a finite set for $i = 1, 2$, and $3$, and $\mathcal{W}$ is a finite set,

• $P_{Y_1,Y_2,Y_3|X_1,X_2,X_3} = W_{Y_1,Y_2,Y_3|X_1,X_2,X_3}$,

• $(W, U_1, U_2, U_3) - (X_1, X_2, X_3) - (Y_1, Y_2, Y_3)$ is a Markov chain

and (b) univariate functions $g_i : \mathcal{U}_i \to \mathcal{U}_i$ for $i = 2, 3$. Let $p^r$ be a prime power such that $\max_i |g_i(\mathcal{U}_i)| \le p^r$.

Let $U_{i1} = g_i(U_i)$ for $i = 2, 3$, and let $U_{\bar{1}} = U_{21} + U_{31}$ denote the sum of $U_{21}$ and $U_{31}$ in the unique finite
field of size $p^r$.

Let $\mathcal{B}$ denote the set of all subsets of $\{1, 2, 3, 21, 31\}$ such that (a) if 2 is present in the subset then 21
must also be present and similarly (b) if 3 is present in the subset, then 31 must also be present. Let $\mathcal{B}_1$
denote the set of all subsets of $\{1, \bar{1}\}$ that contain the element 1, and let $T_{\bar{1}} = \max\{T_{21}, T_{31}\}$. Let $\mathcal{B}_i$
denote the set of all subsets of $\{i, i1\}$ that contain the element $i$, for $i = 2, 3$.
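
The index sets over which the covering conditions range can be enumerated mechanically. The sketch below is an illustration only: the helper `closed_subsets` and the string labels are hypothetical. It lists every subset that respects the stated implications, which shows that there are 18 such subsets of $\{1, 2, 3, 21, 31\}$ (including the empty set, whose condition is trivial) and 9 such subsets of $\{1, 2, 12, 21\}$ in the two-receiver case.

```python
from itertools import combinations

def closed_subsets(elements, implications):
    """All subsets s of `elements` such that, for every pair (a, b) in
    `implications`, a in s implies b in s."""
    result = []
    for k in range(len(elements) + 1):
        for combo in combinations(elements, k):
            s = frozenset(combo)
            if all(b in s for a, b in implications if a in s):
                result.append(s)
    return result

# Three-receiver index set: 2 requires 21, and 3 requires 31.
B = closed_subsets(["1", "2", "3", "21", "31"], [("2", "21"), ("3", "31")])

# Two-receiver index set: 2 requires 21, and 1 requires 12.
A = closed_subsets(["1", "2", "12", "21"], [("2", "21"), ("1", "12")])
```

Each element of `B` corresponds to one covering inequality, so the number of constraints stays small even though the scheme uses five codebooks.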

For every triple $(P, g_2, g_3) \in \mathbb{D}_L$, let $\alpha_L(P, g_2, g_3)$ be the set of all quadruples of rates and costs
$(R_1, R_2, R_3, C)$ such that there exist 10 non-negative real numbers $S_i, T_i$ for $i = 1, 2, 3$ and $S_{i1}, T_{i1}$ for
$i = 2, 3$ that satisfy (a) $R_1 = T_1 - S_1$, $R_2 = T_2 - S_2 + T_{21} - S_{21}$ and $R_3 = T_3 - S_3 + T_{31} - S_{31}$, (b) $T_i \ge S_i$,
$\log p^r \ge S_i$ for $i = 1, 2, 3$ and $T_{i1} \ge S_{i1}$ and $\log p^r \ge S_{i1}$ for $i = 2, 3$, (c) covering condition: for all $\theta \in \mathcal{B}$,

$|\theta| \log p^r - H(U_\theta | W) \le S_\theta$
and (d) packing conditions: for all $\theta \in \mathcal{B}_i$ for $i = 1, 2, 3$,

$|\theta| \log p^r - H(U_\theta | U_{\theta^c}, Y_i, W) \ge T_\theta$
and $E[c(X_1, X_2, X_3)] \le C$, and let $\alpha_L(W, c)$ denote the closure of the union of $\alpha_L(P, g_2, g_3)$ over all $\mathbb{D}_L$.
Theorem 2. For every 3-to-1 DMBC $(W, c)$, every quadruple in $\alpha_L(W, c)$ is achievable.
Proof. A proof will be provided in a detailed expansion of this paper. □

In the following we give an outline of the coding scheme used to achieve this rate region. We omit the 
formal proof in the interest of brevity. 
4.3 Outline of the coding scheme

The nature of the 3-to-1 broadcast channel indicates users 2 and 3 need not decode any parts of the other
users' messages. With this in mind, we propose an encoding scheme based on 5 codebooks. User 1's
message is communicated using a single codebook built on $U_1$. User 2's message $M_2$ is split into two parts.
The codebook built over $U_{21}$ carries message $M_{21}$, which is a part of $M_2$. The codebook built over $U_2$ carries
the rest of user 2's message. Similarly, the codebook built over $U_{31}$ carries message $M_{31}$, which is a part of
user 3's message $M_3$. The codebook built over $U_3$ carries the rest of user 3's message.

Marton's rate region involves decoding $M_{21}$, $M_{31}$, $M_1$ at decoder 1, $M_2$ at decoder 2 and $M_3$ at decoder 3.
This involves user 1 decoding $U_{21}^n$, $U_{31}^n$, $U_1^n$, the codewords corresponding to $M_{21}$, $M_{31}$ and $M_1$ respectively.
The key difference in the coding technique we propose is to let decoder 1 decode $U_{21}^n + U_{31}^n$ instead of
$U_{21}^n$, $U_{31}^n$. While this retains uncertainty in $U_{21}^n$, $U_{31}^n$, thus enabling larger individual rates for users 2 and
3, when appropriately chosen, $U_{21}^n + U_{31}^n$ contains sufficient information about the interference pattern.

In order to constrain the number of possibilities for $U_{21}^n + U_{31}^n$, while keeping the size of each codebook
large, the codebooks over $U_{21}$ and $U_{31}$ are chosen to be cosets of a common linear code. Deriving source
coding bounds for correlated linear codes introduces new elements. Similarly, decoding codewords from
correlated channel codes introduces new elements. In the interest of brevity, we omit the details and give
only the key concepts.
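
The effect of choosing cosets of a common linear code can be seen in a toy binary experiment. The sketch below is an illustration under simplifying assumptions, not the paper's construction: it works over $\mathbb{F}_2$ with both codebooks being cosets of the same code (no nesting, no binning), codewords are stored as integers whose bits are the coordinates, and all names are hypothetical. It shows that the set of pairwise sums of two such cosets is again a single coset, so it is no larger than one codebook, whereas two unstructured codebooks of the same size generally produce many more distinct sums.

```python
import random

def span(generators):
    """All codewords (as bitmask ints) of the binary linear code
    spanned by the given generator rows."""
    words = {0}
    for g in generators:
        words |= {w ^ g for w in words}
    return words

def sum_set(book_a, book_b):
    """Set of all bitwise (mod-2) sums of one word from each codebook."""
    return {a ^ b for a in book_a for b in book_b}

rng = random.Random(0)   # fixed seed so the toy example is reproducible
n, k = 12, 4             # blocklength and number of generator rows
gens = [rng.getrandbits(n) for _ in range(k)]
code = span(gens)

# Two cosets of the SAME linear code, standing in for the U21 and U31 codebooks.
shift2, shift3 = rng.getrandbits(n), rng.getrandbits(n)
coset2 = {c ^ shift2 for c in code}
coset3 = {c ^ shift3 for c in code}
structured_sums = sum_set(coset2, coset3)

# Two unstructured codebooks of the same size, for comparison.
random2 = set(rng.sample(range(2 ** n), len(code)))
random3 = set(rng.sample(range(2 ** n), len(code)))
unstructured_sums = sum_set(random2, random3)
```

Here `structured_sums` is exactly the coset of `code` shifted by `shift2 ^ shift3`, which is the compression of the interference pattern that decoder 1 exploits.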

Fix a triple $(P, g_2, g_3) \in \mathbb{D}_L(W, c)$. We have 3 primary auxiliary random variables $U_1$, $U_2$ and $U_3$, and
two functions $g_2$ and $g_3$. From these we get two secondary auxiliary random variables as $U_{21} = g_2(U_2)$
and $U_{31} = g_3(U_3)$. Generate a random sequence $W^n$ from the product distribution $P_W^n$. We will construct
5 codes, one for each random variable. The codes are built over the common finite field of size $p^r$. Let $C_i$
denote the code associated with the random variable $U_i$, and similarly let $C_{j1}$ denote the codes associated
with $U_{j1}$ for $j = 2, 3$.

The codes are constrained to be coset codes to facilitate the first decoder decoding $U_{21} + U_{31}$. The codes
$C_{21}$ and $C_{31}$ are nested within each other. In other words, if $|C_{21}| \le |C_{31}|$, then we let $C_{21} \subseteq C_{31}$, and
vice versa. The 5 coset codes are constructed by choosing the generator matrices uniformly at random and
independently, while enforcing the nesting structure between the codes of $U_{21}$ and $U_{31}$. We will probabilistically
partition these codes into bins. Let $B_{j1}(i)$ denote the $i$th bin of the code associated with $U_{j1}$ for $j = 2, 3$,
and similarly, let $B_j(i)$ denote the $i$th bin of the code associated with $U_j$ for $j = 1, 2, 3$. The bins, however,
will not have the structure of coset codes. That is, the finer codes are coset codes, and the coarser codes are
random codes. The reason for choosing such an ensemble is as follows. First, we only need the outer
codes to be coset codes so that algebraic structure can be used to enable the first receiver to decode the
interference pattern. More importantly, if we let the coarse codes (bins) be coset codes, it turns out
that we lose some performance as compared to random bins.

We use the letter $S$ to denote the rates of the bins of the random variables, and the letter $T$ to denote the
rates of the codes of the random variables. For example, the rate of the bins of $U_{21}$ is given
by $S_{21}$, and that of $U_1$ is $S_1$. The transmission rates are given by: $R_1 = T_1 - S_1$, $R_2 = T_2 - S_2 + T_{21} - S_{21}$,
and $R_3 = T_3 - S_3 + T_{31} - S_{31}$.

Encoding: The encoder is given three messages $M_1$, $M_2$ and $M_3$ of rates $R_1$, $R_2$, and $R_3$, respectively,

10 Recall, $\mathcal{U}_{21}$ and $\mathcal{U}_{31}$ are finite sets of cardinality $p^r$ and can thus be associated with a common finite field.
to transmit. The encoder splits the second message into $M_{21}$ and $M_{22}$, of rates $T_{21} - S_{21}$ and $T_2 - S_2$,
respectively, and similarly splits the third message into $M_{31}$ and $M_{32}$. Recall that $U_{21} = g_2(U_2)$ and
$U_{31} = g_3(U_3)$. The encoder selects the quintuple of bins

$(B_{21}(M_{21}), B_{31}(M_{31}), B_1(M_1), B_2(M_{22}), B_3(M_{32}))$

and looks for a quintuple of vectors, one from each bin, that is jointly typical with $W^n$ with respect to the
distribution $P_{W,U_{21},U_{31},U_1,U_2,U_3}$. If there is at least one such quintuple, then the encoder selects one of
them, obtains a channel input vector by applying a random transformation to it using $P_{X|U_1,U_2,U_3}$, and
sends the vector over the channel. If no such quintuple is available, the encoder declares an error and sends a
randomly chosen channel input vector.

It can be shown that the probability of encoding error asymptotically approaches zero if the rates of
the bins are not too small. In particular, we need the following conditions: for all $\theta \in \mathcal{B}$,

$$S_\theta > |\theta| \log p^r - H(U_\theta | W)$$
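
To make the covering condition concrete, the sketch below evaluates its right-hand side for a single codebook (the case $|\theta| = 1$) from a joint pmf of $(W, U)$. The function names and the example pmfs are hypothetical, and entropies are measured in bits.

```python
import math
from collections import defaultdict

def conditional_entropy(joint):
    """H(U|W) in bits for a pmf given as {(w, u): probability}."""
    p_w = defaultdict(float)
    for (w, _u), pr in joint.items():
        p_w[w] += pr
    return -sum(pr * math.log2(pr / p_w[w])
                for (w, _u), pr in joint.items() if pr > 0)

def min_bin_rate(joint, field_size):
    """log(field size) - H(U|W): the bin rate must exceed this value."""
    return math.log2(field_size) - conditional_entropy(joint)

# U uniform over F_3 needs (almost) no binning; a skewed U needs a positive bin rate.
uniform = {(0, u): 1 / 3 for u in range(3)}
skewed = {(0, 0): 0.8, (0, 1): 0.1, (0, 2): 0.1}
print(min_bin_rate(uniform, 3), min_bin_rate(skewed, 3))
```

The further the target distribution is from uniform over the field, the larger the bins must be, which reflects the rate cost of using a coset code as the fine code.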

Decoding: The first decoder receives its channel output vector and looks for a unique
pair of vectors, one from $C_1$ and one from the larger of $C_{21}$ and $C_{31}$, that is jointly typical with the channel
output vector and the sequence $W$ with respect to the distribution $P_{W,U_1,U_{31}+U_{21},Y_1}$. If there is such a
pair, it lets $\hat{M}_1$ denote the index of the bin in $C_1$ that contains the first vector and declares $\hat{M}_1$ as the
reconstructed message. Otherwise, it declares an error and selects a random message for reconstruction.

The second decoder receives its channel output vector and looks for a unique pair of vectors,
one from $C_{21}$ and one from $C_2$, that is jointly typical with the channel output vector and $W$ with
respect to the distribution $P_{W,U_{21},U_2,Y_2}$. If there exists such a pair, let $\hat{M}_{21}$ and $\hat{M}_{22}$ denote the indices
of the bins in $C_{21}$ and $C_2$ that contain the unique vector pair. The decoder declares the reconstructed
message as $(\hat{M}_{21}, \hat{M}_{22})$. A similar decoding strategy is employed at the third receiver. It can be shown
that the probability of decoding error at all the receivers goes to zero if the rates of the codes are not too
high. In particular, we need to have, for all $\theta \in \mathcal{B}_i$ for $i = 1, 2, 3$,

$$|\theta| \log p^r - H(U_\theta | U_{\theta^c}, Y_i, W) > T_\theta$$

5 General Broadcast Channel 
5.1 Coding Theorem 

In this section, we consider the general DMBC with 3 receivers. We present an achievable rate region
for this channel using coset codes. This is the second main result of the paper. For a given broadcast
channel described by $(W_{Y_1,Y_2,Y_3|X}, c)$, let $\mathbb{D}_L(W, c)$ denote the set of all (a) probability distributions
$P_{W,U_1,U_2,U_3,X,Y_1,Y_2,Y_3}$ defined over the sets $\mathcal{W} \times \mathcal{U}_1 \times \mathcal{U}_2 \times \mathcal{U}_3 \times \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2 \times \mathcal{Y}_3$ having the following
properties:

• $\mathcal{U}_i$ is a finite set for $i = 1, 2, 3$ and $\mathcal{W}$ is a finite set,

• $(W, U_1, U_2, U_3) - X - (Y_1, Y_2, Y_3)$ is a Markov chain

and (b) six functions $g_{ij} : \mathcal{U}_i \to \mathcal{U}_{ij}$ for all $i, j \in \{1, 2, 3\}$, $i \neq j$. Let $U_{ji} = g_{ji}(U_j)$. Let $p_i^{r_i}$ be a prime power
such that $|\mathcal{U}_i| \le p_i^{r_i}$, and

$$\max_{j \in \{1,2,3\} : j \neq i} |\mathcal{U}_{ji}| \le p_i^{r_i}$$

for $i = 1, 2, 3$. Hence we can endow the alphabets of $U_{21}$, $U_{31}$, and $U_1$ with the algebraic structure of the
unique finite field of size $p_1^{r_1}$, and similarly for the triples $(U_{12}, U_{32}, U_2)$ and $(U_{13}, U_{23}, U_3)$.

We will use the following notation for a compact description of the rate region. We will use double
indexed rates such as $S_{ij}$ and double indexed random variables such as $U_{ij}$, where each index can take
values in $I = \{1, 2, 3\}$ and $i \neq j$. For a given pair $(i, j)$, let $k$ denote the element in $I$ such that $k \neq i$ and
$k \neq j$. For example, when $(i, j) = (1, 3)$, $k = 2$. Let $U_{\hat{i}}$ denote the collection of random variables $\{U_{ij}, U_{ik}\}$.
Let $S_{\hat{i}}$ denote the sum rate $S_{ij} + S_{ik}$. For example, $S_{\hat{1}} = S_{13} + S_{12}$. Let $U_{\tilde{i}}$ denote the sum $U_{ji} + U_{ki}$, where
$+$ is the addition operation of the corresponding finite field. Let $T_{\tilde{i}}$ denote $\max\{T_{ji}, T_{ki}\}$. For example,
$T_{\tilde{1}} = \max\{T_{21}, T_{31}\}$. Observe that in the tilde notation, the index $i$ becomes the second index. For every element
$s \in \{1, 2, 3, 12, 13, 21, 23, 31, 32, \tilde{1}, \tilde{2}, \tilde{3}\}$, let $A_s$ denote the logarithm of the size of the finite field associated with the alphabet
of $U_s$. A similar notation is used for subsets. Let $\pi_i$ be a permutation on the set $\{ij, ik, \tilde{i}\}$, for $i = 1, 2, 3$.
Let $\mathcal{B}$ denote the set of all subsets of $\{1, 2, 3, 12, 13, 21, 23, 31, 32\}$ such that if $i$ is present in the subset,
then $ij$ and $ik$ must also be present. A curious reader may note that $|\mathcal{B}| = 214$.

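For every PMF $P$ and six functions $g_{ij}(\cdot)$ in $\mathbb{D}_L$, let $\alpha_L(P, g_{ij})$ denote the set of all quadruples of
rates and costs $(R_1, R_2, R_3, C)$ such that there exist 18 non-negative real numbers $T_i, S_i, T_{ij}, S_{ij}$ for all
$i, j \in \{1, 2, 3\}$, $i \neq j$, and 3 permutations $\pi_1, \pi_2$ and $\pi_3$, that satisfy the following for all $i, j \in \{1, 2, 3\}$,
$i \neq j$: (a) $R_i = T_i - S_i + T_{\hat{i}} - S_{\hat{i}}$, (b) $T_{ij} \ge S_{ij}$, $T_i \ge S_i$, $S_{ij} \le A_{ij}$ and $S_i \le A_i$, (c) covering conditions:
for all $\theta \in \mathcal{B}$,

$$A_\theta - H(U_\theta | W) < S_\theta$$

and (d) packing conditions

$$\begin{aligned}
T_i &< A_i - H(U_i | U_{\hat{i}}, U_{\tilde{i}}, Y_i, W) \\
T_i + T_{\pi_i(ij)} &< A_i + A_{\pi_i(ij)} - H(U_i, U_{\pi_i(ij)} | U_{\pi_i(ik)}, U_{\pi_i(\tilde{i})}, Y_i, W) \\
T_{\pi_i(ik)} &< A_{\pi_i(ik)} - H(U_{\pi_i(ik)} | U_{\pi_i(\tilde{i})}, Y_i, W) \\
T_{\pi_i(\tilde{i})} &< A_{\pi_i(\tilde{i})} - H(U_{\pi_i(\tilde{i})} | Y_i, W)
\end{aligned}$$

and $E[c(X)] \le C$. Let $\alpha_L(W, c)$ denote the convex closure of the union of $\alpha_L(P, g_{ij})$ over all elements of $\mathbb{D}_L(W, c)$.

Theorem 3. For every DMBC $(W, c)$, every quadruple $(R_1, R_2, R_3, C)$ in $\alpha_L(W, c)$ is achievable.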

Proof. A proof will be provided in a detailed expansion of this paper. □ 



5.2 Coding Scheme 

Each receiver is assigned a main auxiliary random variable, for example $U_i$ for receiver $i$. The first
receiver wishes to reconstruct a bivariate function of a part of the signal meant for receiver 2 and a part
of the signal meant for receiver 3. This is given by $U_{\tilde{1}} = U_{21} + U_{31} = g_{21}(U_2) + g_{31}(U_3)$. Similarly,
$U_{\tilde{2}} = U_{12} + U_{32} = g_{12}(U_1) + g_{32}(U_3)$ is decoded at receiver 2, and so on. We employ a form of successive
decoding strategy at each receiver. The first decoder attempts to recover $(U_{12}, U_{13}, U_{\tilde{1}}, U_1)$.
A permutation of $\{12, 13, \tilde{1}\}$ is chosen, for example $(13, \tilde{1}, 12)$. Then $U_{13}$ is decoded first, $U_{\tilde{1}}$ is
decoded next, and then the pair $(U_{12}, U_1)$ is decoded jointly. Each random variable is assigned a nested
code. The fine code is a coset code, and the coarse code is obtained by "probabilistic" partitioning of the
fine code. The codes associated with $U_{12}$, $U_{32}$ and $U_2$ are constructed on the unique finite field of size $p_2^{r_2}$,
those associated with $U_{13}$, $U_{23}$ and $U_3$ are constructed on the unique finite field of size $p_3^{r_3}$, and so on.
The fine codes of $U_{12}$ and $U_{32}$ are nested within each other depending on their sizes. The codes associated
with $U_i$ are generated independently of the other codes.

Appendix

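Proof. (Theorem 1):

Let $\mathbb{D}_R(W_{Y|X}, c)$ be the set of distributions $P_{WUVXY}$ defined on $\mathcal{W} \times \mathcal{U} \times \mathcal{V} \times \mathcal{X} \times \mathcal{Y}$, where

1. $\mathcal{Y} = \mathcal{Y}_1 \times \mathcal{Y}_2 \times \mathcal{Y}_3$

2. $\mathcal{W}$ is a finite set

3. $\mathcal{U} = \mathcal{U}_{12} \times \mathcal{U}_{31} \times \mathcal{U}_{23}$ and $\mathcal{V} = \mathcal{V}_1 \times \mathcal{V}_2 \times \mathcal{V}_3$ are cartesian products of three finite sets$^{11}$

such that,

1. $WUV - X - Y$ is a Markov chain,
2. $P_{Y|X} = W_{Y|X}$.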

Let $\alpha_R(P_{WUVXY})$ be the set of rate triples and costs$^{12}$ $(R_1, R_2, R_3, C)$ such that there exist 18 non-negative
real numbers $K_1, K_2, K_3, K_{12}, L_{12}, S_{12}, K_{31}, L_{31}, S_{31}, K_{23}, L_{23}, S_{23}$ and $T_1, S_1, T_2, S_2, T_3, S_3$ that
satisfy the following 3 rate splitting constraints $R_1 = K_1 + K_{12} + L_{31} + T_1$, $R_2 = K_2 + K_{23} + L_{12} + T_2$,



and $R_3 = K_3 + K_{31} + L_{23} + T_3$, 11 covering constraints

$$\begin{aligned}
I(U_{12};U_{23}|W) &\le S_{12} + S_{23} \\
I(U_{12};U_{31}|W) &\le S_{12} + S_{31} \\
I(U_{23};U_{31}|W) &\le S_{23} + S_{31} \\
I(U_{12};U_{23};U_{31}|W) &\le S_{12} + S_{23} + S_{31} \\
I(U_{12};U_{23};U_{31}|W) + I(V_1;U_{23}|U_{12}U_{31}W) &\le S_1 + S_{12} + S_{23} + S_{31} \\
I(U_{12};U_{23};U_{31}|W) + I(V_2;U_{31}|U_{23}U_{12}W) &\le S_2 + S_{12} + S_{23} + S_{31} \\
I(U_{12};U_{23};U_{31}|W) + I(V_3;U_{12}|U_{31}U_{23}W) &\le S_3 + S_{12} + S_{23} + S_{31} \\
I(V_1;U_{23}|U_{12}U_{31}W) + I(V_2;U_{31}|U_{12}U_{23}W) + I(V_1;V_2|U_{12}U_{23}U_{31}W) &\le S_1 + S_2 + S_{12} + S_{23} + S_{31} - I(U_{12};U_{23};U_{31}|W) \\
I(V_2;U_{31}|U_{23}U_{12}W) + I(V_3;U_{12}|U_{23}U_{31}W) + I(V_2;V_3|U_{12}U_{23}U_{31}W) &\le S_2 + S_3 + S_{12} + S_{23} + S_{31} - I(U_{12};U_{23};U_{31}|W) \\
I(V_3;U_{12}|U_{31}U_{23}W) + I(V_1;U_{23}|U_{31}U_{12}W) + I(V_3;V_1|U_{12}U_{23}U_{31}W) &\le S_3 + S_1 + S_{12} + S_{23} + S_{31} - I(U_{12};U_{23};U_{31}|W) \\
I(V_1;U_{23}|U_{12}U_{31}W) + I(V_2;U_{31}|U_{12}U_{23}W) + I(V_3;U_{12}|U_{31}U_{23}W) + I(V_1;V_2;V_3|U_{12}U_{23}U_{31}W) &\le S_1 + S_2 + S_3 + S_{12} + S_{23} + S_{31} - I(U_{12};U_{23};U_{31}|W)
\end{aligned}$$

and 15 packing constraints

$$\begin{aligned}
I(V_1;Y_1|WU_{12}U_{31}) &\ge T_1 + S_1 \\
I(U_{12}V_1;Y_1|WU_{31}) + I(U_{12};U_{31}|W) &\ge K_{12} + L_{12} + S_{12} + T_1 + S_1 \\
I(U_{31}V_1;Y_1|WU_{12}) + I(U_{12};U_{31}|W) &\ge K_{31} + L_{31} + S_{31} + T_1 + S_1 \\
I(U_{12}U_{31}V_1;Y_1|W) + I(U_{12};U_{31}|W) &\ge K_{12} + L_{12} + S_{12} + K_{31} + L_{31} + S_{31} + T_1 + S_1 \\
I(WU_{12}U_{31}V_1;Y_1) + I(U_{12};U_{31}|W) &\ge K_1 + K_2 + K_3 + K_{12} + L_{12} + S_{12} + K_{31} + L_{31} + S_{31} + T_1 + S_1 \\
I(V_2;Y_2|WU_{23}U_{12}) &\ge T_2 + S_2 \\
I(U_{12}V_2;Y_2|WU_{23}) + I(U_{12};U_{23}|W) &\ge K_{12} + L_{12} + S_{12} + T_2 + S_2 \\
I(U_{23}V_2;Y_2|WU_{12}) + I(U_{12};U_{23}|W) &\ge K_{23} + L_{23} + S_{23} + T_2 + S_2 \\
I(U_{12}U_{23}V_2;Y_2|W) + I(U_{12};U_{23}|W) &\ge K_{12} + L_{12} + S_{12} + K_{23} + L_{23} + S_{23} + T_2 + S_2 \\
I(WU_{12}U_{23}V_2;Y_2) + I(U_{12};U_{23}|W) &\ge K_1 + K_2 + K_3 + K_{12} + L_{12} + S_{12} + K_{23} + L_{23} + S_{23} + T_2 + S_2 \\
I(V_3;Y_3|WU_{23}U_{31}) &\ge T_3 + S_3 \\
I(U_{31}V_3;Y_3|WU_{23}) + I(U_{23};U_{31}|W) &\ge K_{31} + L_{31} + S_{31} + T_3 + S_3 \\
I(U_{23}V_3;Y_3|WU_{31}) + I(U_{23};U_{31}|W) &\ge K_{23} + L_{23} + S_{23} + T_3 + S_3 \\
I(U_{23}U_{31}V_3;Y_3|W) + I(U_{23};U_{31}|W) &\ge K_{23} + L_{23} + S_{23} + K_{31} + L_{31} + S_{31} + T_3 + S_3 \\
I(WU_{23}U_{31}V_3;Y_3) + I(U_{23};U_{31}|W) &\ge K_1 + K_2 + K_3 + K_{23} + L_{23} + S_{23} + K_{31} + L_{31} + S_{31} + T_3 + S_3.
\end{aligned}$$

11 Bounds on the cardinalities of $\mathcal{U}$, $\mathcal{V}$ and $\mathcal{W}$ are not available in the literature yet.
12 This is the three user analogue of $\alpha_R(P_{WV_1V_2XY_1Y_2})$.

Let $\alpha_R(W_{Y|X}, c)$ be the convex closure$^{13}$ of the union of $\alpha_R(P_{WUVXY})$ over all distributions $P_{WUVXY} \in \mathbb{D}_R(W_{Y|X}, c)$.
The characterization of $\alpha_R(P_{WUVXY})$ can be obtained by identifying the covering and
packing bounds in terms of the rates of the 7 codebooks, eliminating the variables that are not of interest using
Fourier-Motzkin elimination, and expressing the rate region in terms of the three rate parameters
$R_1, R_2, R_3$. However, this procedure turns out to be cumbersome for the three user channel and yields in
excess of 100 inequalities that are needed to characterize $\alpha_R(P_{WUVXY})$. The lack of a compact characterization
of $\alpha_R(P_{WUVXY})$ is one of the key difficulties in establishing the sub-optimality of Marton's rate
region. We circumvent this difficulty as follows. Suppose $(R_1, 1 - h_b(\epsilon), 1 - h_b(\epsilon)) \in \alpha_R(W_{Y|X}, c)$; then there
must exist a distribution, say $P_{WUVXY} \in \mathbb{D}_R(W_{Y|X}, c)$, such that $(R_1, 1 - h_b(\epsilon), 1 - h_b(\epsilon)) \in \alpha_R(P_{WUVXY})$.
This is because of the extremal nature of the operating point in the second and the third coordinates. That
is, $R_2 = 1 - h_b(\epsilon)$ is the maximum rate at which communication can take place between the encoder
and the second decoder, and similarly for the third decoder. Hence $(R_1, 1 - h_b(\epsilon), 1 - h_b(\epsilon))$ cannot be a
convex combination of points where the second or the third coordinate is strictly larger than or strictly
smaller than $1 - h_b(\epsilon)$.

Lemma 1. If $(R_1, 1 - h_b(\epsilon), 1 - h_b(\epsilon)) \in \alpha_R(P_{WUVXY})$, then the following conditions must be satisfied.

$$K_1 = K_2 = K_3 = K_{23} = L_{23} = K_{12} = L_{31} = S_2 = S_3 = S_{23} = 0 \qquad (4)$$

$$(Y_2, X_2, V_2, U_{12}) - (WU_{23}) - (V_3, U_{31}, X_3, Y_3) \qquad (5)$$

$$(V_3, X_3, V_1, U_{13}) - (WU_{23}U_{12}V_2) - (X_2, Y_2) \qquad (6)$$

$$(V_2, X_2, V_1, U_{12}) - (WU_{23}U_{31}V_3) - (X_3, Y_3) \qquad (7)$$

$$S_{12} = I(U_{12};U_{23}|W), \text{ and } S_{31} = I(U_{31};U_{23}|W) \qquad (8)$$

and $(WU_{23})$ is independent of $Y_2$ and is independent of $Y_3$.



Using these relations, we can now simplify the 3 rate splitting, 11 covering and 15 packing constraints
as follows: $R_2 = R_3 = 1 - h_b(\epsilon) = I(V_2U_{12};Y_2|WU_{23}) = I(V_3U_{31};Y_3|WU_{23})$, $R_2 \ge T_2$, $R_3 \ge T_3$,

$$\begin{aligned}
I(V_1;U_{23}V_2V_3|WU_{12}U_{31}) &\le S_1 \\
I(V_2;Y_2|WU_{12}U_{23}) &\ge T_2 \ge 0 \\
I(V_3;Y_3|WU_{31}U_{23}) &\ge T_3 \ge 0 \\
I(V_1;Y_1U_{23}|WU_{12}U_{31}) &\ge R_1 + S_1 \\
I(V_1U_{12};Y_1U_{23}|WU_{31}) + I(U_{12};U_{31}|W) - I(U_{23};U_{12}|W) &\ge R_2 - T_2 + R_1 + S_1 \\
I(V_1U_{31};Y_1U_{23}|WU_{12}) + I(U_{12};U_{31}|W) - I(U_{23};U_{31}|W) &\ge R_3 - T_3 + R_1 + S_1 \\
I(V_1U_{12}U_{31};Y_1U_{23}|W) + I(U_{12};U_{31}|W) - I(U_{23};U_{12}|W) - I(U_{23};U_{31}|W) &\ge R_2 - T_2 + R_3 - T_3 + R_1 + S_1,
\end{aligned}$$

where we have added the following 4 non-negative quantities to the left hand sides of the last 4 inequalities, respectively:
$I(V_1;U_{23}|Y_1WU_{12}U_{31})$, $I(V_1U_{12};U_{23}|Y_1WU_{31})$, $I(V_1U_{31};U_{23}|Y_1WU_{12})$, and $I(V_1U_{12}U_{31};U_{23}|Y_1W)$.
This only weakens the constraints and in turn enlarges Marton's rate region.

13 It is not yet clear whether the closure of the union is convex or not.

Now we are ready for a Fourier-Motzkin elimination. After such an elimination, we finally get the
following rate region: $R_2 = R_3 = 1 - h_b(\epsilon) = I(V_2U_{12};Y_2|WU_{23}) = I(V_3U_{31};Y_3|WU_{23})$, and

$$R_1 \le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3|WU_{12}U_{23}U_{31}) \qquad (9)$$

$$R_1 \le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3|WU_{12}U_{23}U_{31}) + I(U_{12};Y_1|WU_{23}U_{31}) - I(U_{12};Y_2|WU_{23}) \qquad (10)$$

$$R_1 \le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3|WU_{12}U_{23}U_{31}) + I(U_{31};Y_1|WU_{23}U_{12}) - I(U_{31};Y_3|WU_{23}) \qquad (11)$$

$$R_1 \le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3|WU_{12}U_{23}U_{31}) + I(U_{12}U_{31};Y_1|WU_{23}) - I(U_{12};Y_2|WU_{23}) - I(U_{31};Y_3|WU_{23}) \qquad (12)$$

Now let us look at the first bound (equation (9)) of the above four. Using the Markov chains
$(V_3, X_3, V_1, U_{13}) - (WU_{23}U_{12}V_2) - (X_2, Y_2)$ and $(V_2, X_2, V_1, U_{12}) - (WU_{23}U_{31}V_3) - (X_3, Y_3)$, and denoting the
quadruple $(WU_{12}U_{23}U_{31})$ as $\tilde{W}$ and the sum $X_2 + X_3$ as $S_1$, we get

$$\begin{aligned}
R_1 &\le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3|WU_{12}U_{23}U_{31}) && (13) \\
&= I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3X_2X_3|WU_{12}U_{23}U_{31}) && (14) \\
&\le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;X_2,X_3|WU_{12}U_{23}U_{31}) && (15) \\
&\le I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;X_2 + X_3|WU_{12}U_{23}U_{31}) && (16) \\
&= \sum_{w} P_{\tilde{W}}(w) \left[ I(V_1;Y_1|\tilde{W} = w) - I(V_1;S_1|\tilde{W} = w) \right], && (17)
\end{aligned}$$

where the equality in (14) follows from the second and the third Markov chains of Lemma 1. An astute
reader can make the connection between the right hand side of the last equation and the capacity of the
Gelfand-Pinsker channel [JT|. Observe that the random variables appearing on the right hand side of the
last equation have the following probability mass function:

$$P_{\tilde{W}} \, P_{S_1 V_1|\tilde{W}} \, P_{X_1|V_1,S_1,\tilde{W}} \, P_{Y_1|X_1,S_1}$$

Consider the following binary Gelfand-Pinsker channel with $Y = X \oplus_2 S \oplus_2 N$, where $X$, $S$ and $N$
are binary valued, $P(N = 1) = \delta$ and $P(S = 1) = \alpha$. $N$ and $S$ are independent. Let $l : \{0, 1\} \to \mathbb{R}^+$
be a cost function with $l(0) = 0$ and $l(1) = 1$. Let $C(q, \alpha, \delta)$ denote the Gelfand-Pinsker capacity of this
channel with non-causal observation of the side information $S$ at the encoder and a cost constraint of
$q$. Consider the following lemma.

Lemma 2. For every triple $(q, \alpha, \delta) \in (0, \frac{1}{2})^3$, we have

$$C(q, \alpha, \delta) \le h_b(\delta * q) - h_b(\delta),$$

and equality holds if and only if any one of $q, \delta$ belongs to the set $\{0, \frac{1}{2}\}$ or $\alpha = 0$.
Using this lemma, we can see that for $q, \delta \in (0, \frac{1}{2})$, we have

$$R_1 < h_b(\delta * q) - h_b(\delta)$$

unless $H(X_2 + X_3|WU_{12}U_{23}U_{31}) = 0$. This along with the Markov chain $(X_2, V_2, U_{12}) - (WU_{23}) - (X_3, V_3, U_{13})$ of
Lemma 1 implies that $H(X_2|WU_{23}U_{12}) = H(X_3|WU_{23}U_{31}) = 0$.

Now collecting all the results, we make the following closing arguments. If the first upper bound on
$R_1$ (see equation (9)), as given by $I(V_1;Y_1|WU_{12}U_{23}U_{31}) - I(V_1;V_2V_3|WU_{12}U_{23}U_{31})$, is strictly smaller than
$h_b(\delta * q) - h_b(\delta)$, then we are done, and $R_1 < h_b(\delta * q) - h_b(\delta)$.

If not, we have equality all the way from equation (13) to equation (17), and, with $\tilde{W}$ denoting the quadruple $(WU_{12}U_{23}U_{31})$, we must have

$$\begin{aligned}
I(V_1;Y_1|\tilde{W}) &= I(V_1;Y_1|\tilde{W}, X_2, X_3) \\
&= I(X_1;Y_1|\tilde{W}, X_2, X_3) = H(X_1 \oplus_2 N_1|\tilde{W}, X_2, X_3) - H(N_1|\tilde{W}, X_2, X_3) \\
&= h_b(q * \delta) - h_b(\delta).
\end{aligned}$$

Now looking at the fourth bound on $R_1$ (see equation (12)), and using the fact that $(WU_{23})$ is independent
of $Y_2$ and independent of $Y_3$ (see Lemma 1), we get

$$\begin{aligned}
I(U_{12}U_{31};Y_1|WU_{23}) - I(U_{12};Y_2|WU_{23}) - I(U_{31};Y_3|WU_{23}) &= I(X_2, X_3;Y_1|W) - I(X_2;Y_2) - I(X_3;Y_3) \\
&\le 1 - H(X_1 \oplus_2 N_1|W, X_2, X_3) - (2 - 2h_b(\epsilon)) \\
&= 2h_b(\epsilon) - h_b(\delta * q) - 1 \\
&< 0 \quad \text{if } h_b(\epsilon) < \frac{1 + h_b(\delta * q)}{2}.
\end{aligned}$$

Hence we have shown that if $\epsilon$ is such that

$$h_b(\delta * q) < h_b(\epsilon) < \frac{1 + h_b(\delta * q)}{2},$$

then $R_1 < h_b(q * \delta) - h_b(\delta)$. □
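
The quantities in this argument are easy to evaluate numerically. The sketch below is illustrative only: the function names and the sample values of $q$, $\delta$ and $\epsilon$ are hypothetical, and the upper limit of the window for $h_b(\epsilon)$ is taken to be $(1 + h_b(\delta * q))/2$, which is what the requirement $2h_b(\epsilon) - h_b(\delta * q) - 1 < 0$ amounts to.

```python
import math

def hb(x):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + (1 - a)b."""
    return a * (1 - b) + (1 - a) * b

def side_info_bound(q, delta):
    """h_b(delta * q) - h_b(delta), the bound on R_1 discussed above."""
    return hb(conv(delta, q)) - hb(delta)

def fourth_bound_gap(eps, delta, q):
    """2 h_b(eps) - h_b(delta * q) - 1; negative means bound (12) is strictly tighter."""
    return 2 * hb(eps) - hb(conv(delta, q)) - 1

def eps_in_window(eps, delta, q):
    """Check h_b(delta * q) < h_b(eps) < (1 + h_b(delta * q)) / 2 (assumed window)."""
    t = hb(conv(delta, q))
    return t < hb(eps) < (1 + t) / 2

delta, q = 0.05, 0.1
print("bound on R1:", side_info_bound(q, delta))
for eps in (0.1, 0.2, 0.3):
    print(eps, eps_in_window(eps, delta, q), fourth_bound_gap(eps, delta, q))
```

For these sample values, $\epsilon = 0.2$ falls inside the window while $\epsilon = 0.1$ and $\epsilon = 0.3$ fall outside it.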

Proof. (Lemma 1): Substituting the relation $R_2 = K_2 + L_{12} + K_{23} + T_2$ in the 10th packing constraint we
get

$$\begin{aligned}
R_2 + K_1 + K_3 + K_{12} + L_{23} + S_2 &\le I(WU_{23}U_{12}V_2;Y_2) + I(U_{12};U_{23}|W) - S_{12} - S_{23} \\
&\le I(WU_{23}U_{12}V_2;Y_2) \le I(WU_{12}U_{23}U_{31}V_1V_2V_3X_1X_2X_3;Y_2) \\
&= I(X_2;Y_2) \le 1 - h_b(\epsilon).
\end{aligned}$$

Since $R_2 = 1 - h_b(\epsilon)$, we must have equality everywhere in the above chain. Hence, we get $K_1 =
K_3 = K_{12} = L_{23} = S_2 = 0$ and $S_{12} + S_{23} = I(U_{12};U_{23}|W)$. Moreover, we have $(V_1, V_3, X_3, U_{13}) -
(WU_{23}U_{12}V_2) - Y_2$. Since $X_2$ and $Y_2$ are related by a binary symmetric channel, an elementary probability
argument shows that $(V_1, V_3, X_3, U_{13}) - (WU_{23}U_{12}V_2) - X_2$. Using a similar argument
for the third receiver we get $K_1 = K_2 = K_{23} = L_{31} = S_3 = 0$, $S_{23} + S_{31} = I(U_{23};U_{31}|W)$, and
$(V_1, V_2, X_2, U_{12}) - (WU_{23}U_{13}V_3) - (X_3, Y_3)$. Using these in the fourth covering constraint we get

$$\begin{aligned}
I(U_{12};U_{23}|W) + I(U_{23};U_{31}|W) - S_{23} &= S_{12} + S_{23} + S_{31} \\
&\ge I(U_{12};U_{23}|W) + I(U_{12}U_{23};U_{31}|W).
\end{aligned}$$

Hence we get

$$0 \le S_{23} \le -I(U_{12};U_{31}|WU_{23}) \le 0, \qquad (18)$$

which implies that $S_{23} = 0$ and $U_{12} - (W, U_{23}) - U_{31}$. Substituting the condition $S_2 = S_3 = 0$ in the
9th covering constraint gives us

$$\begin{aligned}
0 &= I(V_2;U_{31}|WU_{12}U_{23}) + I(V_3;U_{12}|WU_{13}U_{23}) + I(V_2;V_3|WU_{12}U_{23}U_{31}) \\
&= H(V_2|WU_{23}U_{12}) + H(V_3|WU_{23}U_{31}) - H(V_2, V_3|WU_{23}U_{12}U_{31}).
\end{aligned}$$

This implies that $V_2 - (WU_{12}U_{23}) - (U_{31}V_3)$ and $V_3 - (WU_{31}U_{23}) - (U_{12}V_2)$. This relation along with
$U_{12} - (W, U_{23}) - U_{31}$ gives us $(V_2, U_{12}) - (WU_{23}) - (V_3, U_{31})$. Using the above relations in the 12th and
15th packing constraints, we get

$$I(X_3;Y_3) = R_3 \le I(U_{31}V_3;Y_3|WU_{23}) \le I(X_3;Y_3)$$

and

$$I(X_3;Y_3) = R_3 \le I(WU_{23}U_{31}V_3;Y_3) \le I(X_3;Y_3).$$

Combining these two equations, we get the constraint that $(WU_{23})$ is independent of $Y_3$, and similarly
independent of $Y_2$.

□ 

Proof. (Lemma 2): The Gelfand-Pinsker capacity $C(q, \alpha, \delta)$ is given by [27, 41]

$$C(q, \alpha, \delta) = \max_{P_{UX|S} : E[X] \le q} I(U;Y) - I(U;S),$$

where the maximization is over all conditional PMFs $P_{UX|S}$ defined over $\mathcal{U} \times \{0, 1\}^2$ such that $E[X] \le q$,
where $\mathcal{U}$ is a finite set and the joint distribution of the quadruple $(U, X, S, Y)$ is given by $P_S P_{UX|S} P_{Y|XS}$. It
is sufficient to restrict our attention to an auxiliary alphabet $\mathcal{U}$ of size $|\mathcal{X}| + |\mathcal{S}| - 1 = 3$.

The capacity of the above channel when both the encoder and the decoder have access to the side
information is given by

$$C_S(q, \alpha, \delta) = \max_{P_{X|S} : E[X] \le q} I(X;Y|S) = h_b(\delta * q) - h_b(\delta),$$

and there is a unique capacity achieving input distribution, which is given by $P_{X|S}(0|0) = P_{X|S}(0|1) = 1 - q$.
Hence the Gelfand-Pinsker capacity $C(q, \alpha, \delta)$ equals $C_S(q, \alpha, \delta)$ if and only if [42] the unique capacity
achieving input distribution in the latter case can be expanded into a distribution $Q_{UXSY}$ on the set
$\mathcal{U} \times \{0, 1\}^3$ such that the following conditions are satisfied: (a) $|\mathcal{U}| = 3$, (b) $X$ is a function of $(U, S)$, (c)
$S - Y - U$, and (d) $U - (X, S) - Y$, and the marginal $Q_{XSY} = P_{XSY}$. We will show by contradiction that there exists no
such expansion.

Let $\theta = q * \delta$, and without loss of generality let $\mathcal{U} = \{0, 1, 2\}$. Let there exist $Q_{UXSY}$ for a triple
$(q, \alpha, \delta) \in (0, \frac{1}{2})^3$ such that the four conditions are satisfied. Since $Q_{XSY} = P_{XSY}$, we have $Q_{SY}(0, 0) =
(1 - \alpha)(1 - \theta)$, $Q_{SY}(0, 1) = (1 - \alpha)\theta$, $Q_{SY}(1, 0) = \alpha\theta$, and $Q_{SY}(1, 1) = \alpha(1 - \theta)$. Let $Q_{U|Y}(i|0) = \beta_i$ and
$Q_{U|Y}(i|1) = \gamma_i$ for $i = 0, 1, 2$. From $Q_{SY}$ and $Q_{U|Y}$, and imposing the Markov chain $S - Y - U$, we get
the distribution $Q_{SYU}$. Now since $X$ is a function of $(U, S)$, let $Q_{X|US}(0|00) = z_0$, $Q_{X|US}(0|10) = z_1$,
$Q_{X|US}(0|20) = z_2$, $Q_{X|US}(0|01) = z_3$, $Q_{X|US}(0|11) = z_4$, and $Q_{X|US}(0|21) = z_5$, where $z_i \in \{0, 1\}$ for
$i = 0, 1, \ldots, 5$. Using these we get the values for $Q_{SYX}$ as given in Table 1.

Table 1: Distributions $Q_{SYX}$ and $P_{SYX}$ of $(S, Y, X)$



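Define

$$\psi_1 = \frac{1 - q - \delta + q\delta}{1 - q - \delta + 2q\delta} \quad \text{and} \quad \psi_2 = \frac{\delta - q\delta}{q + \delta - 2q\delta}.$$

Equating $Q_{SYX} = P_{SYX}$ we get the following two equations:

$$\psi_1 = \beta_0 z_0 + \beta_1 z_1 + \beta_2 z_2 = \gamma_0 z_3 + \gamma_1 z_4 + \gamma_2 z_5$$

and

$$\psi_2 = \beta_0 z_3 + \beta_1 z_4 + \beta_2 z_5 = \gamma_0 z_0 + \gamma_1 z_1 + \gamma_2 z_2.$$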

We will see whether we can find $z_i$ for $i = 0, 1, \ldots, 5$ that satisfy the above two equations. First let us
make two simple observations. The condition $\psi_1 = \psi_2$ is equivalent to the condition: $q = 0$ or $\delta = \frac{1}{2}$ or
$q = 1$. And the condition $\psi_1 + \psi_2 \le 1$ is equivalent to the condition $q \ge \frac{1}{2}$, and equality holds in the latter
if and only if either $q = \frac{1}{2}$ or $\delta = 0$ or $\delta = 1$. Next we can see that there are four cases to consider.
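
The two observations can be checked numerically from the definitions of $\psi_1$ and $\psi_2$. The sketch below compares $\psi_1 - \psi_2$ and $\psi_1 + \psi_2 - 1$ against closed forms obtained by direct algebra; these closed forms, the function name `psi` and the grid are assumptions of this illustration and are not quoted from the text.

```python
def psi(q, delta):
    """Return (psi1, psi2, theta) with theta = q * delta (binary convolution)."""
    theta = q * (1 - delta) + (1 - q) * delta
    psi1 = (1 - q) * (1 - delta) / (1 - theta)
    psi2 = (1 - q) * delta / theta
    return psi1, psi2, theta

grid = [i / 20 for i in range(1, 20)]  # interior points of (0, 1)
max_err_diff = 0.0
max_err_sum = 0.0
for q in grid:
    for delta in grid:
        psi1, psi2, theta = psi(q, delta)
        denom = theta * (1 - theta)
        # Assumed closed forms (from direct algebra on the definitions):
        diff_formula = q * (1 - q) * (1 - 2 * delta) / denom
        sum_formula = delta * (1 - delta) * (1 - 2 * q) / denom
        max_err_diff = max(max_err_diff, abs((psi1 - psi2) - diff_formula))
        max_err_sum = max(max_err_sum, abs((psi1 + psi2 - 1) - sum_formula))

print(max_err_diff, max_err_sum)
```

Under these closed forms, $\psi_1 - \psi_2$ vanishes only when $q \in \{0, 1\}$ or $\delta = \frac{1}{2}$, and $\psi_1 + \psi_2 - 1$ has the sign of $1 - 2q$ whenever $\delta \in (0, 1)$.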
Case 1: Two of $\{z_0, z_1, z_2\}$ and two of $\{z_3, z_4, z_5\}$ are zeros: We cannot have non-zero $z_i$'s sharing the
same $\beta_i$'s because of the first observation made above. So without loss of generality, let $z_1 = z_2 = z_3 =
z_5 = 0$. Then $\psi_1 = \beta_0$ and $\psi_2 = \beta_1$. But, since $\beta_0 + \beta_1 \le 1$, we get the condition that $q \ge \frac{1}{2}$, which is a
contradiction.

Case 2: Two of $\{z_0, z_1, z_2\}$ and one of $\{z_3, z_4, z_5\}$ are zeros (or vice versa): Using the second
observation made above, without loss of generality, let $z_1 = z_2 = z_5 = 0$. This implies that $\psi_1 = \beta_0$ and
$\psi_2 = \beta_0 + \beta_1 = 1 - \beta_2$. Then $\beta_1 = 1 - \beta_2 - \beta_0 = \psi_2 - \psi_1$. Similarly, $\psi_2 = \gamma_0$ and $\psi_1 = \gamma_0 + \gamma_1$, which
imply that $\gamma_1 = \psi_1 - \psi_2$. This implies that one of $\beta_1$ and $\gamma_1$ must be negative, unless $\psi_1 = \psi_2$, which leads
to the condition $q = 0$ or $\delta = \frac{1}{2}$ or $q = 1$. Hence a contradiction.

Case 3: One of $\{z_0, z_1, z_2\}$ and one of $\{z_3, z_4, z_5\}$ are zeros: Using the first observation made above,
without loss of generality, let $z_2 = z_4 = 0$. Then $\psi_1 = \beta_0 + \beta_1 = 1 - \beta_2$ and $\psi_2 = \beta_0 + \beta_2 = 1 - \beta_1$, and
similarly, $\psi_1 = 1 - \gamma_1$ and $\psi_2 = 1 - \gamma_2$. Hence we have $\gamma_1 = \beta_2$ and $\gamma_2 = \beta_1$. At
this point, the Markov chain $U - (X, S) - Y$ comes to our rescue. Let us look at the joint probability of
$(Y, U)$ conditioned on the event $(X, S) = (0, 0)$ as given in Table 2. Enforcing the Markov chain, we get
$\beta_0 \gamma_1 = \beta_1 \gamma_0$. This along with the relations $\beta_1 = \gamma_2$ and $\beta_2 = \gamma_1$ implies $\beta_1 = \beta_2$ or $\beta_1 + \beta_2 = 1$. The first gives
us $\psi_1 = \psi_2$ and leads to $\delta = \frac{1}{2}$ or $q = 1$ or $q = 0$. The second gives $\psi_1 + \psi_2 = 1$ and leads to $\delta = 0$ or $\delta = 1$ or $q = \frac{1}{2}$. Hence a
contradiction.

Table 2: Conditional distribution of $(Y, U)$ given $(S, X) = (0, 0)$.

Case 4: All of $\{z_0, \ldots, z_5\}$ are zeros: In this case we get $\psi_1 = \psi_2 = 0$. This implies that ($q = 1$ or $\delta = 1$)
and ($\delta = 0$ or $q = 1$). Hence a contradiction again.

□